EIGHTH INTERNATIONAL WORKSHOP ON ACADEMIC INFORMATION NETWORKS AND SYSTEMS (WAINS 8): Collaboration on Named Entity Discovery in Thai Agricultural Texts

Authors

  • Asanee KAWTRAKUL
  • Nigel COLLIER
  • Koichi TAKEUCHI
  • Kenji ONO
  • Kitsana WAIYAMAI
Abstract

This paper outlines our collaboration on the construction of a named entity recognizer for the Thai language. This tool will support natural language processing applications in Thai such as high-precision information retrieval, summarization, and machine translation. Named entity recognition has been applied quite successfully to English and Japanese, driven by the interest generated by evaluation conferences in the 1990s such as MUC (Message Understanding Conference) for English and IREX for Japanese. Currently no such tool exists for the Thai language. Initially we will apply machine learning models that have been successful for English and test their applicability to Thai. Our application domain is agriculture, and we intend to build a test collection to train and evaluate our system.

Related resources

P2P Network Trust Management Survey

Peer-to-peer (P2P) applications are no longer limited to home users and are starting to be accepted in academic and corporate environments. While file sharing and instant messaging applications are the most traditional examples, they are no longer the only ones benefiting from the potential advantages of P2P networks. For example, network file storage, data transmission, distributed computing, and co...

PAYMA: A Tagged Corpus of Persian Named Entities

The goal in the named entity recognition task is to classify proper nouns of a piece of text into classes such as person, location, and organization. Named entity recognition is an important preprocessing step in many natural language processing tasks such as question-answering and summarization. Although many research studies have been conducted in this area in English and the state-of-the-art...

Named Entity Recognition on Twitter for Turkish using Semi-supervised Learning with Word Embeddings

Recently, due to the increasing popularity of social media, the necessity for extracting information from informal text types, such as microblog texts, has gained significant attention. In this study, we focused on the Named Entity Recognition (NER) problem on informal text types for Turkish. We utilized a semi-supervised learning approach based on neural networks. We applied a fast unsupervise...

A Study on Information Privacy Issue on Social Networks

In recent years, social networks (SN) have come to be used for communication and networking, socializing, marketing, and daily life. Billions of people in the world are connected through various SN platforms and applications, which results in the generation of massive amounts of data online. This includes personal data or Personally Identifiable Information (PII). While more and more data a...

NeuroNER: an easy-to-use program for named-entity recognition based on neural networks

Named-entity recognition (NER) aims at identifying entities of interest in a text. Artificial neural networks (ANNs) have recently been shown to outperform existing NER systems. However, ANNs remain challenging to use for non-expert users. In this paper, we present NeuroNER, an easy-to-use named-entity recognition tool based on ANNs. Users can annotate entities using a graphical web-based user i...

Publication date: 2003